Visualization-enabled multi-document summarization by Iterative Residual Rescaling

Authors

  • Rie Kubota Ando
  • Branimir Boguraev
  • Roy J. Byrd
  • Mary S. Neff
Abstract

This paper describes a novel approach to multi-document summarization, which explicitly addresses the problem of detecting, and retaining for the summary, multiple themes in document collections. We place equal emphasis on the processes of theme identification and theme presentation. For the former, we apply Iterative Residual Rescaling (IRR); for the latter, we argue for graphical display elements. IRR is an algorithm designed to account for correlations between words and to construct a multi-dimensional topical space indicative of relationships among linguistic objects (documents, phrases, and sentences). Summaries are composed of objects with certain properties, derived by exploiting the many-to-many relationships in such a space. Given their inherent complexity, our multi-faceted summaries benefit from a visualization environment. We discuss some essential features of such an environment.

Similar resources

iDVS: An Interactive Multi-document Visual Summarization System

Multi-document summarization is a fundamental tool for understanding documents. Given a collection of documents, most existing multi-document summarization methods automatically generate a static summary for all users, using unsupervised learning techniques such as sentence ranking and clustering. However, these methods almost entirely exclude humans from the summarization process. They do not allow...

SUTLER: Update Summarizer Based on Latent Topics

This paper deals with our past and recent research in text summarization. We went from single-document summarization through multidocument summarization to update summarization. We describe the development of our summarizer which is based on latent semantic analysis (LSA). The classical LSA-based summarization model was improved by Iterative Residual Rescaling. We propose the update summarizati...

iNeATS: Interactive Multi-Document Summarization

We describe iNeATS – an interactive multi-document summarization system that integrates a state-of-the-art summarization engine with an advanced user interface. Three main goals of the system are: (1) provide a user with control over the summarization process, (2) support exploration of the document set with the summary as the starting point, and (3) combine text summaries with alternative prese...

A Language Independent Algorithm for Single and Multiple Document Summarization

This paper describes a method for language independent extractive summarization that relies on iterative graph-based ranking algorithms. Through evaluations performed on a single-document summarization task for English and Portuguese, we show that the method performs equally well regardless of the language. Moreover, we show how a metasummarizer relying on a layered application of techniques fo...

SentTopic-MultiRank: a Novel Ranking Model for Multi-Document Summarization

Extractive multi-document summarization is mostly treated as a sentence ranking problem. Existing graph-based ranking methods for key-sentence extraction usually attempt to compute a global importance score for each sentence under a single relation. Motivated by the fact that both documents and sentences can be represented by a mixture of semantic topics detected by Latent Dirichlet Allocation (L...

Journal title:
  • Natural Language Engineering

Volume 11

Publication date: 2005